feat: add SHA-1 idempotency primitives for CoreFileObject#963
feat: add SHA-1 idempotency primitives for CoreFileObject#963
Conversation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…o start Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds a no-network, no-mutation primitive that callers can use to compare a local bytes/Path/BinaryIO source against the server-stored SHA-1 checksum on a CoreFileObject node, without triggering a transfer. Also adds MATCHES_LOCAL_CHECKSUM_FEATURE_NOT_SUPPORTED_MESSAGE constant and re-exports it from infrahub_sdk/node/__init__.py. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds InfrahubNode.upload_if_changed() which composes sha1_of_source, upload_from_path/upload_from_bytes, and save() to perform uploads only when local content differs from the server-side checksum. Returns an UploadResult with the locally-computed digest as the post-upload checksum. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Mirror the async InfrahubNode.upload_if_changed on the sync class, extending TestUploadIfChanged to parametrize over both client types (standard + sync) for all 6 test scenarios. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds a `skip_if_unchanged: bool = False` kwarg to `InfrahubNode.download_file`. When True and dest is provided, SHA-1 of the local file is compared against the node's server checksum; a match returns 0 immediately without a network request. Includes @overload signatures and 5 new parametrized tests. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ircuit Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mirror the async InfrahubNode.download_file skip-if-unchanged logic on InfrahubNodeSync, including overloads. Extend TestDownloadSkipIfUnchanged to parametrize over both client types (53 total tests pass). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ment Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Deploying infrahub-sdk-python with
|
| Latest commit: |
5269fb6
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://3d1a5475.infrahub-sdk-python.pages.dev |
| Branch Preview URL: | https://feat-idempotent-file-ops.infrahub-sdk-python.pages.dev |
Codecov Report❌ Patch coverage is
@@ Coverage Diff @@
## stable #963 +/- ##
==========================================
+ Coverage 80.77% 80.87% +0.10%
==========================================
Files 132 119 -13
Lines 10999 10433 -566
Branches 1681 1578 -103
==========================================
- Hits 8884 8438 -446
+ Misses 1566 1470 -96
+ Partials 549 525 -24
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 15 files with indirect coverage changes 🚀 New features to boost your workflow:
|
ajtmccarty
left a comment
There was a problem hiding this comment.
I think this looks good, so approving. comments are little suggestions and a higher-level question that is probably more a reflection of my SDK-ignorance than any issues here
| @@ -0,0 +1 @@ | |||
| Fixed `InfrahubNode.download_file(skip_if_unchanged=True)` on an unsaved node: previously, a node whose local `dest` happened to match the node's in-memory `checksum.value` could silently return `0` without a network round-trip. The method now enforces the saved-node check before the skip-check short-circuit, raising `ValueError("Cannot download file for a node that hasn't been saved yet.")` as it does in the non-skip path. | |||
There was a problem hiding this comment.
is there a way to write this so that is covers how the bug would affect the user?
There was a problem hiding this comment.
Rewritten in user-impact terms in 18e1572 — the new wording foregrounds the "success-looking 0 masks an unsaved node" user-visible failure mode instead of describing the internal reorder.
| value). | ||
| """ | ||
|
|
||
| uploaded: bool |
There was a problem hiding this comment.
was_uploaded makes it more clear that this is Boolean reflecting whether something happened or not
There was a problem hiding this comment.
Renamed to was_uploaded across the dataclass, both async+sync upload_if_changed return sites, docstrings, tests, and the PR description in 4e2d5a9.
| class TestUploadResult: | ||
| def test_carries_uploaded_and_checksum(self) -> None: | ||
| result = UploadResult(uploaded=True, checksum="abc123") | ||
| assert result.uploaded is True | ||
| assert result.checksum == "abc123" | ||
|
|
||
| def test_checksum_optional(self) -> None: | ||
| result = UploadResult(uploaded=False, checksum=None) | ||
| assert result.checksum is None | ||
|
|
||
| def test_is_frozen(self) -> None: | ||
| result = UploadResult(uploaded=True, checksum="abc") | ||
| with pytest.raises(AttributeError): # FrozenInstanceError is an AttributeError on all supported versions | ||
| result.uploaded = False # type: ignore[misc] |
There was a problem hiding this comment.
I don't know if this is really necessary
There was a problem hiding this comment.
Agreed — dropped test_is_frozen in 93fe705. You were right that it was asserting @dataclass(frozen=True)'s contract rather than anything this PR adds.
|
|
||
|
|
||
| @dataclass(frozen=True) | ||
| class UploadResult: |
There was a problem hiding this comment.
good use of a dataclass instead of a tuple 👍
| # No HTTP request should have been issued. | ||
| assert httpx_mock.get_requests() == [] | ||
|
|
||
| async def test_uploads_when_checksum_differs( |
There was a problem hiding this comment.
should this one include the httpx_mock and check if the upload was sent?
There was a problem hiding this comment.
Added a positive-path assertion in 9b8ebc1 — test_uploads_when_checksum_differs now verifies via httpx_mock.get_requests() that the update mutation POST actually fired, instead of just trusting was_uploaded is True.
| client_type: str, | ||
| clients: BothClients, | ||
| file_object_schema: NodeSchemaAPI, | ||
| mock_node_create_with_file: HTTPXMock, |
There was a problem hiding this comment.
should this be checked to verify the upload was sent?
There was a problem hiding this comment.
Covered by the same change in 9b8ebc1 — the test now asserts the update mutation actually fires via httpx_mock.get_requests(), not just the return value.
There was a problem hiding this comment.
it still doesn't look like this test asserts anything about the sent requests
| else: | ||
| result = node.upload_if_changed(source=target) | ||
|
|
||
| assert result.uploaded is True |
There was a problem hiding this comment.
should the name be asserted somehow? truly not sure
There was a problem hiding this comment.
Skipping at this layer — the existing test_node_create_with_file_uses_multipart / test_node_update_with_file_uses_multipart tests already assert the filename in the multipart body, and upload_if_changed delegates to upload_from_path unchanged, so a name assertion here would duplicate coverage. Happy to add one explicitly at the upload_if_changed layer if you'd rather have it visible in this test class.
| skip_if_unchanged: When ``True``, compute the SHA-1 of the file at | ||
| ``dest`` (which must be provided) and compare against the | ||
| node's server checksum. If they match, return ``0`` without | ||
| hitting the network. |
There was a problem hiding this comment.
I'm not an SDK expert, but I am not sure how you know if a remote file has changed without downloading it
There was a problem hiding this comment.
Good catch — the docstrings didn't make this explicit. Clarified in 5269fb6 across UploadResult, matches_local_checksum, and both download_file(skip_if_unchanged=...) docstrings: the comparison uses the checksum attribute as loaded when the node was fetched via client.get(...). If the server replaces the file after the fetch, callers need to re-fetch the node to refresh checksum before the skip-check can see the change.
| clients: BothClients, | ||
| file_object_schema: NodeSchemaAPI, | ||
| tmp_path: Path, | ||
| mock_download_file_to_disk: HTTPXMock, # existing fixture |
There was a problem hiding this comment.
same question about whether this should be verified in the test
There was a problem hiding this comment.
Also covered in 9b8ebc1 — test_downloads_when_local_differs now checks httpx_mock.get_requests() for exactly one GET to /api/storage/files/.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…otency tests Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
| @@ -0,0 +1 @@ | |||
| Fixed `InfrahubNode.download_file(skip_if_unchanged=True)` silently returning `0` bytes-written on nodes that had never been saved. Previously, if the local file at `dest` happened to have the same SHA-1 as whatever was cached on the unsaved node's `checksum.value`, callers got back a success-looking `0` result instead of a clear failure — which could mask the fact that the node had no server counterpart at all. Calling `download_file` on an unsaved node now raises `ValueError` consistently, whether or not `skip_if_unchanged` is set. | |||
There was a problem hiding this comment.
correct me if I'm wrong, but skip_if_unchanged did not exist before this PR. I still don't really understand what this is saying, but I don't think it can be "fixed" and reference a new kwarg
| client_type: str, | ||
| clients: BothClients, | ||
| file_object_schema: NodeSchemaAPI, | ||
| mock_node_create_with_file: HTTPXMock, |
There was a problem hiding this comment.
it still doesn't look like this test asserts anything about the sent requests
Summary
Adds SHA-1 compare-and-skip primitives to
InfrahubNode/InfrahubNodeSyncso downstream libraries stop maintaining parallel copies of this logic. Four public additions:sha1_of_source(source: bytes | Path | BinaryIO) -> strininfrahub_sdk.file_handler— streaming SHA-1 helper (64 KiB chunks; rewindsBinaryIOto original position). Canonical import path:from infrahub_sdk.file_handler import sha1_of_source.matches_local_checksum(source) -> bool— atomic compare against the node's server-storedchecksumattribute. No network, no mutation.upload_if_changed(source, name=None) -> UploadResult— composes compare +upload_from_*+save(). Returns a frozenUploadResult(was_uploaded, checksum)dataclass. Skips the transfer when the local SHA-1 matches.download_file(..., skip_if_unchanged=True)— short-circuits the download whendestexists on disk with a matching SHA-1. Returns0bytes written on skip.Both async and sync twins. Full async↔sync symmetry.
Motivation
Both opsmill/nornir-infrahub#71 and opsmill/infrahub-ansible#317 shipped near-identical
hashlib.sha1(...).hexdigest()+ compare logic. Centralising it in the SDK gives one source of truth and — because nornir-infrahub already extended idempotency to the download side — brings that capability to the Ansible collection too.Usage — nornir-infrahub example
Concretely, the nornir-infrahub tasks in
nornir_infrahub/plugins/tasks/file_object.pycollapse from ~30 lines of hand-rolled idempotency into a single method call per operation.Upload
Before (today, on nornir-infrahub
main):After (with this PR):
No
_sha1helper, nohashlibimport, no two-branchupload_from_path/upload_from_bytesdispatch — the SDK handles the compare, the staging, and the save in one call. The returnedUploadResult.was_uploadedmaps directly to Nornir'sResult(changed=...).Download
Before (today):
After:
Two branches collapse into one, the manual SHA-1 compare and file-exists handling move into the SDK, and the
changedsignal falls out of the return value — no more tracking it imperatively across branches.Notes
upload_if_changedcomputessha1_of_source(source)once up front and reuses it for both the compare and the returnedUploadResult.checksum. This is the semantically correct value (the server storessha1(received_bytes), which equals the local digest), and it avoids double-hashing or reading aBinaryIOtwice.download_file(skip_if_unchanged=True)validates the saved-node precondition before the skip-check, so an unsaved node with a coincidentally-matchingchecksum.valuestill raisesValueErrorrather than silently returning0. See thefixcommit in the branch and the accompanying.fixed.mdchangelog fragment.*_FEATURE_NOT_SUPPORTED_MESSAGEconstants were added alongside the existingFILE_DOWNLOAD_FEATURE_NOT_SUPPORTED_MESSAGE, following the same convention.Follow-ups (not in this PR)
Once this lands and a release ships with the new lower bound:
infrahub-sdkfloor and replace local SHA-1 code innornir_infrahub/plugins/tasks/file_object.pywith the new methods (per the usage example above).infrahub-sdkfloor and replace local SHA-1 code inplugins/module_utils/node.py(get_file_object_local_checksum,_update_object_with_file) with the new methods. Also gains download-side idempotency for free by wiringskip_if_unchanged=Trueinto theobject_file_fetchaction.Test plan
InfrahubNodeandInfrahubNodeSync(84 file-related tests acrosstest_file_handler.pyandtest_file_object.py, up from ~41 baseline)ruff checkcleanmypycleanAdded+Fixedfragments validateChangelog fragments
changelog/+idempotent-file-ops.added.md— describes the four new primitiveschangelog/+idempotent-file-ops-unsaved-node.fixed.md— describes the unsaved-node guard indownload_file(skip_if_unchanged=True)